Sports are spontaneous generators of stories. Through skill and chance, the script of each game
is dynamically written in real time by players acting out possible trajectories allowed by a sport’s 
rules. By properly characterizing a given sport’s ecology of ‘game stories’, we are able to capture the 
sport’s capacity for unfolding interesting narratives, in part by contrasting them with random walks. 
Here, we explore the game story space afforded by a data set of 1,310 Australian Football League 
(AFL) score lines. We find that AFL games exhibit a continuous spectrum of stories rather than 
distinct clusters. We show how coarse-graining reveals identifiable motifs ranging from last minute 
comeback wins to one-sided blowouts. Through an extensive comparison with biased random walks, 
we show that real AFL games deliver a broader array of motifs than null models, and we provide 
consequent insights into the narrative appeal of real games. 

PACS numbers: 89.65.-s, 89.20.-a, 05.40.Jc, 02.50.Ey 


I. INTRODUCTION 

While sports are often analogized to a wide array of
other arenas of human activity (particularly war), well
known story lines and elements of sports are conversely
invoked to describe other spheres. Each game generates
a probabilistic, rule-based story [1], and the stories of
games provide a range of motifs which map onto
narratives found across the human experience: dominant,
one-sided performances; back-and-forth struggles;
underdog upsets; and improbable comebacks. As fans, people
enjoy watching suspenseful sporting events (unscripted
stories) and following the fortunes of their favorite
players and teams US].

Despite the inherent story-telling nature of sporting
contests, and notwithstanding the vast statistical
analyses surrounding professional sports including the many
observations of and departures from randomness pro,
the ecology of game stories remains a largely unexplored,
data-rich area m. We are interested in a number of
basic questions such as whether the game stories of a sport
form a spectrum or a set of relatively isolated clusters,
how well models such as random walks fare in
reproducing the specific shapes of real game stories, whether or
not these stories are compelling to fans, and how
different sports compare in the stories afforded by their various
rule sets.

Here, we focus on Australian Rules Football, a high
skills game originating in the mid 1800s. We describe
Australian Rules Football in brief and then move on to
extracting and evaluating the sport’s possible game
stories. Early on, the game evolved into a winter sport
quite distinct from other codes such as soccer or rugby
while bearing some similarity to Gaelic football. The game
was played as state-level competitions for most of the
1900s, with the Victorian Football League (VFL) being most
prominent; a national competition emerged in the 1980s, with
the Australian Football League (AFL) becoming a
formal entity in 1990. The AFL is currently constituted by
18 teams located in five of Australia’s states.

Games run over four quarters, each lasting around 30
minutes (including stoppage time), and each team has
18 on-field players. Games (or matches) are
played on large ovals typically used for cricket in the
summer and of variable size (generally 135 to 185 meters in
length). The ball is oblong and may be kicked or
handballed (an action where the ball is punched off one hand
with the closed fist of the other) but not thrown.
Marking (cleanly catching a kicked ball) is a central feature
of the game, and the AFL is well known for producing
many spectacular marks and kicks for goals [13].

The object of the sport is to kick goals, with the
customary standard of highest score wins (ties are relatively
rare but possible). Scores may be 6 points or 1 point as
follows, some minor details aside. Each end of the ground
has four tall posts. Kicking the ball (untouched) through
the central two posts results in a ‘goal’ or 6 points. If the
ball is touched or goes through either of the outer two
sets of posts, then the score is a ‘behind’ or 1 point. Final
scores are thus a combination of goals (6) and behinds
(1) and on average tally around 100 per team. Poor
conditions or poor play may lead to scores below 50, while
scores above 200 are achievable in the case of a
‘thrashing’ (the record high and low scores are 239 and 1). For
season standings, wins are worth 4 points, ties 2 points,
and losses 0.
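The scoring rule can be checked with a few lines of Python. This is only an illustrative sketch; the function name `afl_score` is our own, and the goal and behind counts are those quoted in the Fig. 1 caption.

```python
def afl_score(goals: int, behinds: int) -> int:
    """Total points for a team: 6 per goal plus 1 per behind."""
    return 6 * goals + behinds


# Score line quoted in the Fig. 1 caption (Geelong v Hawthorn, 2014).
geelong = afl_score(goals=15, behinds=16)
hawthorn = afl_score(goals=12, behinds=15)
margin = geelong - hawthorn
print(geelong, hawthorn, margin)
```

The totals reproduce the final score line of 106 to 87 and the 19 point margin given in the caption.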
der of the paper. In Secs. IV and V, we demonstrate
that game stories form a spectrum rather than distinct
clusters, and we apply coarse-graining to elucidate game
story motifs at two levels of resolution. We then
provide a detailed comparison between real game motifs and
the smaller taxonomy of motifs generated by our biased
random walk null model. We explore the possibility of
predicting final game margins in Sec. VI. We offer
closing thoughts and propose further avenues of analysis in
Sec. VII.


Figure 1. Representative ‘game story’ (or score differential 
‘worm’) for an example AFL contest held between Geelong 
and Hawthorn on Monday April 21, 2014. Individual scores 
are either goals (6 points) or behinds (1 point). Geelong won 
by 19 with a final score line of 106 (15 goals, 16 behinds) to 
87 (12 goals, 15 behinds). 


Of interest to us here is that the AFL provides
an excellent test case for extracting and describing
the game story space of a professional sport. We
downloaded 1,310 AFL game scoring progressions from
http://afltables.com (ranging from the 2008 season to
midway through the 2014 season) [14]. We extracted
the scoring dynamics of each game down to second level
resolution, with the possible events at each second being
(1) a goal for either team, (2) a behind for either team,
or (3) no score [15]. Each game thus affords a ‘worm’
tracking the score differential between two teams. We
will call these worms ‘game stories’ and we provide an
example in Fig. 1. The game story shows that Geelong
pulled away from Hawthorn (their great rival over the
preceding decade) towards the end of a close, back and
forth game.
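As a minimal sketch of how such a worm could be assembled from second-level scoring events, consider the following Python. The event format `(second, team, points)` with `team` equal to +1 or -1, and the helper names, are assumptions made for illustration and are not taken from the data source.

```python
def game_story(events, length):
    """Score differential at each second of a game.

    events: iterable of (second, team, points), where team is +1 or -1
            and points is 6 (goal) or 1 (behind). Hypothetical format.
    length: game length in seconds.
    """
    deltas = {}
    for second, team, points in events:
        deltas[second] = deltas.get(second, 0) + team * points
    worm, diff = [], 0
    for t in range(length):
        diff += deltas.get(t, 0)  # no entry means "no score" at second t
        worm.append(diff)
    return worm


def lead_changes(worm):
    """Count how often the sign of the (nonzero) differential flips."""
    changes, last_sign = 0, 0
    for diff in worm:
        sign = (diff > 0) - (diff < 0)
        if sign != 0:
            if last_sign != 0 and sign != last_sign:
                changes += 1
            last_sign = sign
    return changes


toy_worm = game_story([(1, +1, 1), (3, -1, 6), (4, +1, 6)], length=6)
largest_lead = max(abs(d) for d in toy_worm)
```

Summary quantities such as the largest lead or the number of lead changes then follow directly from the list of differentials.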

Each game story provides a rich representation of a
game’s flow, and, at a glance, quickly indicates key
aspects such as largest lead, number of lead changes,
momentum swings, and one-sidedness. Game stories
also evidently allow for a straightforward quantitative
comparison between any pair of matches.

For the game story ecology we study here, an
important aspect of the AFL is that rankings (referred to as
the ladder) depend first on number of wins (and ties),
and then percentage of ‘points for’ versus ‘points against’.
Teams are therefore generally motivated to score as
heavily as possible while still factoring in increased potential
for injury.

We order the paper as follows. In Sec. II, we first
present a series of basic observations about the
statistics of AFL games. We include an analysis of
conditional probabilities for winning as a function of lead size.
We show through a general comparison to random walks
that AFL games are collectively more diffusive than
simple random walks, leading to a biased random walk null
model based on skill differential between teams. We then
introduce an ensemble of 100 sets of 1,310 biased random
walk game stories which we use throughout the remain-


II. BASIC GAME FEATURES 

A. Game length 

While every AFL game officially comprises four
20 minute quarters of playing time, the inclusion of
stoppage time means there is no set quarter or game length,
resulting in some minor complications for our analysis.
We see an approximately Gaussian distribution of game
lengths with the average game lasting a little over two
hours at 122 minutes, and 96% of games run for around
112 to 132 minutes ($\sigma \simeq 4.8$ minutes). In comparing
AFL games, we must therefore accommodate different
game lengths. Possible approaches include
dilation, truncation, and extension (by holding a final score
constant), and we will explain and argue for the last of
these in Sec. IV.

B. Scoring across quarters 

In post-game discussions, commentators will often
focus on the natural chapters of a given sport. For
quarter-based games, matches will sometimes be described as ‘a
game of quarters’ or ‘a tale of two halves.’ For the AFL,
we find that scoring does not, on average, vary greatly as
the game progresses from quarter to quarter (we will
however observe interesting quarter-scale motifs later on).
For our game database, we find there is slightly more
scoring done in the second half of the game (46.96 versus
44.91), where teams score one more point, on average, in
the fourth quarter versus the first quarter (23.48 versus
22.22). This minor increase may be due to a heightened
sense of the importance of each point as game time
begins to run out, the fatiguing of defensive players, or as
a consequence of having ‘learned an opponent ’ pang. 

C. Probability of next score as a function of lead size

In Fig. 2, we show that, as for a number of other sports,
the probability of scoring next (either a goal or behind)
at any point in a game increases linearly as a function of
the current lead size (the National Basketball Association
is a clear exception) [HH - H211 171 . This reflects a kind of 
Figure 2. Conditional probability of scoring the next goal
or behind given a particular lead size. Bins are in six point 
blocks with the extreme leads collapsed: < -72, -71 to -66, ..., 
-6 to -1, 1 to 6, 7 to 12, ..., > 72. As for most sports, the 
probability of scoring next increases approximately linearly 
as a function of current lead size. 


momentum gain within games, and could be captured by 
a simple biased model with scoring probability linearly 
tied to the current lead. Other studies have proposed 
this linearity to be the result of a heterogeneous skill 
model [IS], and, as we describe in the following section,
we use a modification of such an approach. 


D. Conditional probabilities for winning 

We next examine the conditional probability of
winning given a lead of size $\ell$ at a time point $t$ in a game,
$P_t(\text{Winning} \mid \ell)$. We consider four example time points
(the end of each of the first three quarters and with 10
minutes left in game time) and plot the results in Fig. 3.
We fit a sigmoid curve (see caption) to each conditional
probability. As expected, we immediately see an increase
in winning probability for a fixed lead as the game
progresses.

These curves could be referenced to give a rough
indication of an unfolding game’s likely outcome and may be
used to generate a range of statistics. As an example, we
define likely victory as $P(\text{Winning} \mid \ell) > 0.90$ and find $\ell$
= 32, 27, 20, and 11 are the approximate corresponding
lead sizes at the four time points. Losing games after
holding any of these leads might be viewed as ‘snatching
defeat from the jaws of victory.’

Similarly, if we define close games as those with
$P(\text{Winning} \mid \ell) < 0.60$, we find the corresponding
approximate lead sizes to be $\ell \simeq$ 6, 5, 4, and 2. These
leads could function in the same way as the save statistic
in baseball, i.e., to acknowledge when a pitcher
performs well enough in a close game to help ensure their
team’s victory. Expanding beyond the AFL, such
probability thresholds for likely victory or uncertain outcome
may be modified to apply to any sport, and could be
greatly refined using detailed information such as recent
performances, stage of a season, and weather conditions.
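The two sets of thresholds can be cross-checked against each other. The sketch below assumes the logistic form described in the Fig. 3 caption with the offset fixed at zero, infers the steepness $k$ from each reported likely-victory lead, and then computes the lead at which the same curve reaches 0.60. The function names and time point labels are ours, and the paper's actual fitted parameters are not used.

```python
import math


def win_probability(lead, k, lead0=0.0):
    """Logistic winning probability for a given lead (assumed form)."""
    return 1.0 / (1.0 + math.exp(-k * (lead - lead0)))


def lead_for_probability(prob, k, lead0=0.0):
    """Invert the logistic curve: lead at which P(win) equals prob."""
    return lead0 + math.log(prob / (1.0 - prob)) / k


# Reported leads at which P(win) is roughly 0.90 for the four time points.
likely_victory_leads = {"Q1": 32, "Q2": 27, "Q3": 20, "10 min left": 11}

close_game_leads = {}
for label, lead90 in likely_victory_leads.items():
    k = math.log(0.90 / 0.10) / lead90      # steepness implied by the 0.90 lead
    close_game_leads[label] = lead_for_probability(0.60, k)

print({label: round(v, 2) for label, v in close_game_leads.items()})
```

Under these assumptions the implied close-game leads round to 6, 5, 4, and 2 points, in line with the values quoted above.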



Figure 3. Conditional probability of winning given a lead of
size $\ell$ at the end of the first three quarters (A–C) and with
10 minutes to go in the game (D). Bins are comprised of the
aggregate of every 6 points as in Fig. 2. The dark blue curve is
a sigmoid function of the form $[1 + e^{-k(\ell - \ell_0)}]^{-1}$ where $k$ and
$\ell_0$ are fit parameters determined via standard optimization
using the Python function scipy.optimize.curve_fit. (Note that
$\ell_0$ should be 0 by construction.) As a game progresses, the
threshold for likely victory (winning probability 0.90, upper
red line) decreases as expected, as does a threshold for a close
game (probability of 0.60, lower red line). The slope of the
sigmoid curve increases as the game time progresses showing
the evident greater impact of each point. We note that the
missing data in panel A is a real feature of the specific 1,310
games in our data set.

III. RANDOM WALK NULL MODELS 

A natural null model for a game story is the classic, 
possibly biased, random walk [10 US]. We consider an 
ensemble of modified random walks, with each walk (1) 
composed of steps of ± 6 and ± 1, (2) dictated by a 
randomly drawn bias, (3) running for a variable total 
number of events, and (4) with variable gaps between 
events, all informed by real AFL game data. For the 
purpose of exploring motifs later on, we will create 100 
sets of 1,310 games. 

An important and subtle aspect of the null model is
the scoring bias, which we will denote by $p$. We take
the bias for each game simulation to be a proxy for the
skill differential between two opposing teams, as in [12],
though our approach involves an important adjustment.

In [12], a symmetric skill bias distribution is generated
by taking the relative number of scoring events made by
one team in each game. For example, given a match
between two teams $T_1$ and $T_2$, we find the number of
scoring events generated by $T_1$, $n_1$, and the same for $T_2$,
$n_2$. We then estimate a posteriori the skill bias between
the two teams as:

$$p = \frac{n_1}{n_1 + n_2}. \qquad (1)$$

Figure 4. Skill bias $p$ represents a team’s relative ability to
score against another team and is estimated a posteriori by
the fraction of scoring events made by each team [Eq. (1)]. A
and B: Kolmogorov-Smirnov test D statistic and associated
p-value comparing the observed output skill bias distribution
produced by a presumed input skill distribution $f$ with that
observed for all AFL games in our data set, where $f$ is
Gaussian with mean 0.5 and its standard deviation $\sigma$ is the
variable of interest. For each value of $\sigma$, we created 1,000 biased
random walks with the bias $p$ drawn from the
corresponding normal distribution. Each game’s number of events was
drawn from a distribution of the number of events in real AFL
games (see text). Plot B is an expanded version of the shaded
region in A with finer sampling. We estimated the best fit
to be $\sigma \simeq 0.088$, and we compare the resulting observed bias
distribution with that of [12] in Fig. 5.


In constructing the distribution of $p$, $f(p)$, we discard
information regarding how specific teams perform against
each other over seasons and years, and we are thus only
able to assign skill bias in a random, memoryless fashion
for our simulations. We also note that for games with
more than one value of points available for different
scoring events (as in 6 and 1 for Australian Rules Football),
the winning team may register fewer scoring events than
the losing one.

In [12], random walk game stories were then generated
directly using $f(p)$. However, for small time scales this is
immediately problematic and requires a correction.
Consider using such an approach on pure random walks. We
of course have that $f(p) = \delta(p - 1/2)$ by construction, but
our estimate of $f(p)$ will be a Gaussian of width $\sim t^{-1/2}$,
where we have normalized displacement by time $t$. And
while as $t \to \infty$, our estimate of $f(p)$ approaches the
correct distribution $\delta(p - 1/2)$, we are here dealing with
relatively short random walks. Indeed, we observe that
if we start with pure random walks, run them for, say,
100 steps, estimate the bias distribution, run a new set of
random walks with these biases, and keep repeating this
process, we obtain an increasingly flat bias distribution.
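This broadening is easy to reproduce numerically. The following sketch starts from 5,000 unbiased walks of 100 steps, estimates each walk's bias as its fraction of up-steps, and feeds the estimates back in as the next round's biases. The ensemble size, seed, and function name are arbitrary choices made for illustration.

```python
import random
import statistics


def estimate_biases(true_biases, n_steps, rng):
    """Run one walk per bias and return the observed fraction of up-steps."""
    estimates = []
    for p in true_biases:
        ups = sum(1 for _ in range(n_steps) if rng.random() < p)
        estimates.append(ups / n_steps)
    return estimates


rng = random.Random(0)
biases = [0.5] * 5000          # pure random walks: f(p) is a delta at 1/2
spreads = []
for _ in range(3):             # re-estimate, then reuse estimates as biases
    biases = estimate_biases(biases, n_steps=100, rng=rng)
    spreads.append(statistics.pstdev(biases))

print([round(s, 3) for s in spreads])
```

The first estimate already has a standard deviation near $1/(2\sqrt{100}) = 0.05$ even though every true bias is exactly 1/2, and each further round widens the distribution.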

To account for this overestimate of the spread of skill
bias, we propose the tuning of an input Gaussian
distribution of skill biases so as to produce biased random
walks whose outcomes best match the observed event
biases for real games. We assume that $f$ should be centered
at $p = 0.50$. We then draw from an appropriate
distribution of number of events per game, and tune the standard
deviation of $f$, $\sigma$, to minimize the Kolmogorov-Smirnov
(KS) D statistic and maximize the p-value produced from
a two-tailed KS test between the resulting distribution of
event biases and the underlying, observed distribution for
our AFL data set.
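A simplified version of this tuning loop can be sketched in pure Python. Several assumptions depart from the procedure described here: the "observed" biases are a synthetic stand-in generated with $\sigma = 0.088$ because the AFL data are not available in this text, every game is given a fixed 50 scoring events rather than a draw from the empirical distribution, only the D statistic is minimized (no p-value), and all function names are hypothetical.

```python
import bisect
import random


def ks_statistic(sample_a, sample_b):
    """Two-sample KS D: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:  # the supremum is attained at a data point
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def simulated_event_biases(sigma, n_games, events_per_game, rng):
    """Observed scoring fractions for walks with Gaussian input biases."""
    out = []
    for _ in range(n_games):
        p = min(1.0, max(0.0, rng.gauss(0.5, sigma)))  # clip to [0, 1]
        scored = sum(1 for _ in range(events_per_game) if rng.random() < p)
        out.append(scored / events_per_game)
    return out


def tune_sigma(observed, sigmas, n_games, events_per_game, rng):
    """Grid search for the sigma whose output best matches `observed`."""
    best = None
    for sigma in sigmas:
        simulated = simulated_event_biases(sigma, n_games, events_per_game, rng)
        d = ks_statistic(observed, simulated)
        if best is None or d < best[1]:
            best = (sigma, d)
    return best


rng = random.Random(1)
observed = simulated_event_biases(0.088, 1310, 50, rng)  # synthetic stand-in
grid = [0.02 * i for i in range(1, 9)]                   # 0.02 to 0.16
best_sigma, best_d = tune_sigma(observed, grid, 1000, 50, rng)
print(best_sigma, round(best_d, 3))
```

A finer grid around the minimum, as in panel B of Fig. 4, would be the natural next step.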

We show the variation of D and the p-value as a
function of $\sigma$ in Fig. 4. We then demonstrate in Fig. 5 that
the $\sigma$-corrected distribution produces an observably
better approximation of outcomes than if we used the
observed biases approach of [12]. Because the fit for our
method in Fig. 5 is not exact, a further improvement
(unnecessary here) would be to allow $f$ to be arbitrary
rather than assuming a Gaussian.

With a reasonable estimate of $f$ in hand, we create 100
ensembles of 1,310 null games where each game is
generated with (1) one team scoring with probability $p$ drawn
from the $\sigma$-corrected distribution described above; (2)
individual scores being a goal or behind with
probabilities based on the AFL data set (approximately 0.53 and
0.47); and (3) a variable number of events per simulation
based on: (a) game duration drawn from the
approximated normal distribution described in Sec. II A and (b)
time between events drawn from a chi-squared
distribution fit to the inter-event times of real games.
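The core of one null game can be sketched as follows. For brevity the number of scoring events is passed in directly instead of being derived from a sampled game duration and chi-squared inter-event times (whose fitted parameters are not given in this text), so step (3) is only a placeholder; the function name and defaults are assumptions.

```python
import random


def null_game(rng, n_events, sigma=0.088, p_goal=0.53):
    """One biased random walk game story, indexed by event number.

    The bias p is drawn from a Gaussian centered at 0.5 and clipped to
    [0, 1]; each event is a goal (6) with probability p_goal, else a
    behind (1), and is credited to team +1 with probability p.
    """
    p = min(1.0, max(0.0, rng.gauss(0.5, sigma)))
    margin, path = 0, []
    for _ in range(n_events):
        team = 1 if rng.random() < p else -1
        points = 6 if rng.random() < p_goal else 1
        margin += team * points
        path.append(margin)
    return p, path


rng = random.Random(2)
bias, path = null_game(rng, n_events=50)
steps = [path[0]] + [b - a for a, b in zip(path, path[1:])]
```

Repeating the call 1,310 times gives one ensemble; converting event indices to seconds would additionally require the inter-event time model.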

For a secondary test on the validity of our null model’s
game stories, we compute the variance $\sigma^2$ of the
margin at each event number $n$ for both AFL games and
modified random walks (for the AFL games, we orient
each walk according to home and away status, the
default ordering in the data set). As we show in Fig. 6,
we find that AFL games and biased random walks
produce game stories with $\sigma^2 \sim n^{1.239 \pm 0.009}$ and
$\sigma^2 \sim n^{1.236 \pm 0.012}$, respectively. Collectively, AFL games thus
have a tendency toward runaway score differentials, and
while superdiffusive-like, this superlinear scaling of the
variance can be almost entirely accounted for by our
incorporation of the skill bias distribution $f$.
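The scaling exponent is the slope of a straight-line fit to the variance in log-log space. A minimal sketch of that fit, using ordinary least squares and a synthetic power law in place of the real variance curve, is given below; the helper names are ours.

```python
import math
import statistics


def variance_by_event(paths):
    """Variance of the margin across games at each event number."""
    n_events = min(len(p) for p in paths)
    return [statistics.pvariance([p[n] for p in paths]) for n in range(n_events)]


def loglog_fit(ns, variances):
    """Least-squares slope and intercept of log(variance) against log(n)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(v) for v in variances]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


# Synthetic check: an exact power law should return its own exponent.
ns = list(range(1, 51))
variances = [3.0 * n ** 1.239 for n in ns]
slope, intercept = loglog_fit(ns, variances)
print(round(slope, 3), round(math.exp(intercept), 3))
```

An exponent of 1 would correspond to an unbiased random walk, so a fitted slope above 1 signals the superlinear growth discussed above.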


IV. MEASURING DISTANCES BETWEEN GAMES

Before moving on to our main focus, the ecology of 
game stories, we define a straightforward measure of the 
distance between any pair of games. For any sport, we 
Figure 5. Comparison of the observed AFL skill bias
distribution [balance of scoring events $p$ given in Eq. (1), dashed
blue curve] with that produced by two approaches: (1) We
draw $p$ from a normal distribution using the best candidate
$\sigma$ value with mean 0.50 as determined via Fig. 4 (red curve),
and (2) We choose $p$ from the complete list of observed
biases from the AFL (green curve, the replication method of
[12]). For the real and the two simulated distributions, both
$p$ and $1 - p$ are included for symmetry. The fitted $\sigma$ approach
produces a more accurate estimate of the observed biases,
particularly for competitive matches ($p$ close to 0.50) and one
sided affairs. Inset: Upper half of the distributions plotted
on a semi-logarithmic scale (base 10) revealing that the
replication method of [12] also over produces extreme biases, as
compared to the AFL and our proposed correction using a
numerically determined $\sigma$.



Figure 6. Variance in the instantaneous margin as a function
of event number for real AFL games (solid red curve) and
biased random walks as described in Sec. III (solid blue curve).
We perform fits in logarithmic space using standard least
squares regression (solid black curve for real games, dashed
black for the null model). The biased random walks
satisfactorily reproduce the observed scaling of variance. It thus
appears that AFL game stories do not exhibit inherently
superdiffusive behavior; rather, the observed scaling results
from imbalances between opposing teams.


define a distance measure between two games $i$ and $j$ as

$$D(g_i, g_j) = T^{-1} \sum_{t=1}^{T} \left| g_i(t) - g_j(t) \right|, \qquad (2)$$

where $T$ is the length of the game in seconds, and $g_i(t)$
is the score differential between the competing teams in
game $i$ at second $t$. We orient game stories so that the
Figure 7. Top ten pairwise neighbors as determined by the
distance measure between each game described by Eq. (2).
In all examples, dark gray curves denote the game story. For 
the shorter game of each pair, horizontal solid blue lines show 
how we hold the final score constant to equalize lengths of 
games. 
team whose score is oriented upwards on the vertical axis
wins or ties [i.e., $g_i(T) \geq 0$]. By construction, pairs of
games which have a relatively small distance between
them will have similar game stories. The normalization
factor $1/T$ means the distance remains in the units of
points and can be thought of as the average difference
between point differentials over the course of the two games.

In the case of the AFL, because games do
not run for a standardized time $T$, we extend the game
story of the shorter of the pair to match the length of the
longer game by holding the final score constant. While
not ideal, we observe that the metric performs well in
identifying games that are closely related. We
investigated several alternatives such as linearly dilating the
shorter game, and found no compelling benefits.
Dilation may be useful in other settings but the distortion of
real time is problematic for sports.
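A short sketch of the distance with the extension step, assuming each game story is a non-empty list of score differentials sampled once per second; the function names are illustrative.

```python
def extend(story, length):
    """Pad a game story to `length` by holding its final differential."""
    return story + [story[-1]] * (length - len(story))


def game_distance(g_i, g_j):
    """Mean absolute difference between two game stories, in points."""
    T = max(len(g_i), len(g_j))
    a, b = extend(g_i, T), extend(g_j, T)
    return sum(abs(x - y) for x, y in zip(a, b)) / T


# Toy stories of different lengths: the shorter one is held at 6.
d = game_distance([0, 6, 6], [0, 1, 1, 7])
print(d)
```

Here the padded stories differ by 0, 5, 5, and 1 points at the four time steps, so the distance is 11/4 = 2.75 points.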

In Fig. 7, we present the ten most similar pairs of
games in terms of their stories. These close pairs show the
metric performs as it should and that, moreover,
proximal games are not dominated by a certain type. Among
the panels of Fig. 7, we see pairs demonstrating a team
overcoming an early stumble, pairs showcasing the victor
repelling an attempted comeback, pairs exemplifying a
see-saw battle with many lead changes, and pairs
capturing blowouts, with one team taking control early and
continuing to dominate the contest.


V. GAME STORY ECOLOGY 

Having described and implemented a suitable metric 
for comparing games and their root story, we seek to 
group games together with the objective of revealing 
large scale characteristic motifs. To what extent are well- 
known game narratives—from blowouts to nail-biters to 
improbable comebacks—and potentially less well known 
story lines featured in our collection of games? And 
how does the distribution of real game stories compare 
with those of our biased random walk null model? (We 
note that in an earlier version of the present paper, 
we considered pure, unbiased random walks for the null 
model [H].) 


A. AFL games constitute a single spectrum 

We first compute the pairwise distance between all
games in our data set. We then apply a shuffling
algorithm to order games on a discretized ring so that similar
games are as close to each other as possible. Specifically,
we minimize the cost

$$C = \sum_{i,j} d_{ij}^{2} \cdot D(g_i, g_j)^{-1}, \qquad (3)$$

where $d_{ij}$ is the shortest distance between $i$ and $j$ on the
ring. At each step of our minimization procedure, we
randomly choose a game and determine which swap with









Figure 8. Heat maps for (A) the pairwise distances between
games unsorted on a ring; (B) the same distances after games
have been reordered on the ring so as to minimize the cost
function given in Eq. (3); (C) the same as (B) but with game
indices cycled to make the continuous spectrum of games
evident. We include only every 20th game for clarity and note
that such shuffling is usually performed for entities on a line
rather than a ring. The games at the end of the spectrum
are most dissimilar and correspond to runaway victories and
comebacks (see also Fig. 9).


another game most reduces $C$. We use $d_{ij}^{2}$ by choice and
other powers give similar results.
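One step of such a swap-based minimization might look like the sketch below. It assumes the cost is the sum over pairs of the squared ring distance divided by the game distance, recomputes the full cost for every trial swap (simple but slow for 1,310 games), and requires all off-diagonal game distances to be positive. The names and the toy distance matrix are ours.

```python
import random


def ring_distance(a, b, n):
    """Shortest separation between positions a and b on a ring of size n."""
    gap = abs(a - b)
    return min(gap, n - gap)


def ring_cost(order, D, power=2):
    """Sum over unordered pairs of ring distance**power / game distance."""
    n = len(order)
    total = 0.0
    for a in range(n):
        for b in range(a + 1, n):
            total += ring_distance(a, b, n) ** power / D[order[a]][order[b]]
    return total


def improve_once(order, D, rng):
    """Pick a random position and apply the swap that most reduces the cost."""
    n = len(order)
    a = rng.randrange(n)
    best_cost, best_b = ring_cost(order, D), None
    for b in range(n):
        if b == a:
            continue
        order[a], order[b] = order[b], order[a]      # trial swap
        cost = ring_cost(order, D)
        order[a], order[b] = order[b], order[a]      # undo
        if cost < best_cost:
            best_cost, best_b = cost, b
    if best_b is not None:
        order[a], order[best_b] = order[best_b], order[a]
    return best_cost


D = [[0, 1, 5, 2], [1, 0, 1, 5], [5, 1, 0, 1], [2, 5, 1, 0]]  # toy distances
order = [0, 2, 1, 3]
start_cost = ring_cost(order, D)
new_cost = improve_once(order, D, random.Random(3))
```

Because similar games have small distances and therefore large weights, lowering the cost pulls them toward neighboring positions on the ring.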

In Fig. 8, we show three heat maps for distance $D$
with: (A) games unsorted; (B) games sorted according
to the above minimization procedure; and (C) indices of
sorted games cycled to reveal that AFL games broadly
constitute a continuous spectrum. As we show below, at
the ends of the spectrum are the most extreme blowouts
and the strongest comebacks, i.e., one team dominates
for the first half and then the tables are flipped in the
second half.


B. Coarse-grained motifs 

While little modularity is apparent (there are no
evident distinct classes of games), we may nevertheless per-
Figure 9. Heat matrix for the pairwise distances between
games, subsampled by a factor of 20 as per Fig. 8. A
noticeable split is visible between the blowout games (first six
clusters) and the comeback victories (last three clusters). We plot
dendrograms along both the top and left edges of the matrix,
and as explained in Sec. V C, the boxed numbers reference
the 18 motifs found when the average intra-cluster distance
is set to 11 points. These 18 motifs are variously displayed in
Figs. 12 and 13.


form a kind of coarse-graining via hierarchical clustering 
to extract a dendrogram of increasingly resolved game 
motifs. 

Because we have just shown that the game story
ecology forms a continuum, it is important to stress
that the motifs we find should not be interpreted as well
separated clusters. Adjacent motifs will have similar
game stories at their connecting borders. A physical
example might be the landscape roughness of equal area
regions dividing up a country: two connected areas would
typically be locally similar along their borders. Having
identified a continuum, we are simply now addressing the
variation within that continuum using a range of scales.

We employ a principled approach to identifying
meaningful levels of coarse-graining, leading to families of
motifs. As points are the smallest scoring unit in AFL
games, we use them to mark resolution scales as follows.
First, we define $p_i$, the average distance between games
within a given cluster $i$ as

$$p_i = \frac{1}{n_i (n_i - 1)} \sum_{j=1}^{n_i} \sum_{k=1, k \neq j}^{n_i} D(g_j, g_k). \qquad (4)$$

Here $j$ and $k$ are games placed in cluster $i$, $n_i$ is the
number of games in cluster $i$, and $D$ is the game distance
defined in Eq. (2). At a given depth $d$ of the dendrogram,
we compute $p_i(d)$ for each of the $N(d)$ clusters found,
and then average over all clusters to obtain an average



Figure 10. Average intra-cluster distance $\langle p \rangle$ as a function
of cluster number $N$. Red lines mark the first occurrence in
which the average intra-cluster distance of the $N$ motif
clusters had a value below 12, 11, 10, 9, and 8 points (red text
beside each line), respectively. The next cut for 7 points gives
343 motifs.


intra-cluster distance:

$$\langle p(d) \rangle = \frac{1}{N(d)} \sum_{i=1}^{N(d)} p_i(d). \qquad (5)$$
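Both averages are straightforward to compute from a pairwise distance matrix. The sketch below assumes the within-cluster average runs over all ordered pairs of distinct games, normalized by $n_i(n_i - 1)$, and assigns a distance of zero to single-game clusters, a case the formula leaves undefined. The function names are ours.

```python
def intra_cluster_distance(cluster, D):
    """Average pairwise distance among the games in one cluster.

    cluster: list of game indices; D: symmetric distance matrix.
    Singleton clusters are given distance 0.0 (an assumption).
    """
    n = len(cluster)
    if n < 2:
        return 0.0
    total = sum(D[j][k] for j in cluster for k in cluster if j != k)
    return total / (n * (n - 1))


def average_intra_cluster_distance(clusters, D):
    """Unweighted mean of the per-cluster averages at one dendrogram depth."""
    return sum(intra_cluster_distance(c, D) for c in clusters) / len(clusters)


D = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]   # toy distances between three games
one_cluster = average_intra_cluster_distance([[0, 1, 2]], D)
two_clusters = average_intra_cluster_distance([[0, 1], [2]], D)
print(one_cluster, two_clusters)
```

Note that the mean over clusters is unweighted, so small clusters count as much as large ones.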


We use Ward’s minimum variance method to construct a
dendrogram [20], as shown in Fig. 9. Ward’s method aims
to minimize the within cluster variance at each level of
the hierarchy. At each step, the pairing which results in
the minimum increase in the variance is chosen. These
increases are measured as a weighted squared distance
between cluster centers. We chose Ward’s method over
other linkage techniques based on its tendency to produce
clusters of comparable size at each level of the hierarchy.

At the most coarse resolution of two categories, we see
in Fig. 9 that one sided contests are distinguished from
games that remain closer, and repeated analysis using
k-means clustering suggests the same presence of two major
clusters.

As we are interested in creating a taxonomy of more
particular, interpretable game shapes, we opt to make
cuts as $\langle p(d) \rangle$ first falls below an integer number of points,
as shown in Fig. 10 (we acknowledge that $\langle p(d) \rangle$ does not
decrease perfectly monotonically). As indicated by the
red vertical lines, average intra-cluster point differences
of 12, 11, 10, 9, and 8 correspond to 9, 18, 30, 71, and 157
distinct clusters. Our choice, which is tied to a natural
game score, has a useful outcome of making the number
of clusters approximately double with every single point
decrease in average score differential.
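The cut rule and the approximate doubling can both be illustrated briefly. In the sketch below, the toy curve of average distance against cluster number is invented (and deliberately non-monotonic), while the motif counts are those quoted in the text and the Fig. 10 caption; the function name is ours.

```python
def first_cuts(avg_distance_by_n, thresholds):
    """Smallest cluster number at which the average first drops below each threshold.

    avg_distance_by_n: list of (N, average intra-cluster distance),
    ordered by increasing N.
    """
    cuts = {}
    for threshold in thresholds:
        for n, avg in avg_distance_by_n:
            if avg < threshold:
                cuts[threshold] = n
                break
    return cuts


# Invented, slightly non-monotonic curve for illustration only.
toy_curve = [(1, 20.0), (2, 12.5), (3, 11.8), (4, 12.1), (5, 10.9)]
toy_cuts = first_cuts(toy_curve, thresholds=[12, 11])

# Motif counts reported for cutoffs of 12, 11, 10, 9, 8, and 7 points.
reported_counts = [9, 18, 30, 71, 157, 343]
growth = [b / a for a, b in zip(reported_counts, reported_counts[1:])]
print(toy_cuts, [round(g, 2) for g in growth])
```

The successive growth factors of the reported counts lie between roughly 1.7 and 2.4, which is the sense in which the number of clusters approximately doubles per point.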


Figure 11. Histograms of the number of motifs produced
by 100 ensembles of 1,310 games using the random walk null
model, and evaluating at 11 and 9 point cutoffs (A and B)
as described in Sec. V. For real games, we obtain by
comparison 18 and 71 motifs (vertical red lines in A and B),
which exceed all 100 motif numbers in both cases and
indicate AFL game stories are more diverse than our null model
would suggest.


C. Taxonomy of 18 motifs for real AFL games 

In the remainder of Sec. V, we show and explore in
some depth the taxonomies provided by 18 and 71 motifs
at the 11 and 9 point cutoff scales.

We first show that for both cutoffs, the number of
motifs produced by the biased random walk null model is
typically well below the number observed for the real
game. In Fig. 11, we show histograms of the number of
motifs found in the 100 ensembles of 1,310 null model
games with the real game motif numbers of 18 and 71
marked by vertical red lines. The number of random
walk motifs is variable with both distributions exhibiting
reasonable spread, and also in both cases, the maximum
number of motifs is below the real game’s number of
motifs. These observations strongly suggest that the AFL
generates a more diverse set of game story shapes than our
random walk null model.

We now consider the 18 motif characterization which
we display in Fig. 12 by plotting all individual game
stories in each cluster (light gray curves) and
overlaying the average motif game story (blue/gray/red curves,
explained below).

All game stories are oriented so that the winning team
aligns with the positive vertical axis, i.e., $g_i(T) \geq 0$ (in
the rare case of a tie, we orient the game story randomly),
and motifs are ordered by their final margin (descending).
In all presentations of motifs that follow, we
standardize final margin as the principal index of ordering. We
display the final margin index in the top center of each
Figure 12. Eighteen game motifs as determined by
performing hierarchical clustering analysis and finding when the
average intra-cluster game distance $\langle p \rangle$ first drops below 11
points. In each panel, the main curves are the motifs (the
average of all game stories, shown as light gray curves in
background, within each cluster), and we arrange clusters in
order of the motif winning margin. All motifs are shown with
the same axis limits. Numbers of games within each cluster
are indicated in the bottom right corner of each panel along
with the average number of the nearest biased random walk
games (normalized per 1,310). Motif colors correspond to
relative abundance of real versus random game ratio $R$ as red:
$R > 1.1$; gray: $0.9 < R < 1.1$; and blue: $R < 0.9$. See Fig. 13
for the same motifs reordered by real game to random ratio.


motif panel to ease comparisons when motifs are ordered
in other ways (e.g., by prevalence in the null model). We
can now also connect back to the heat map of Fig. 9
where we use the same indices to mark the 18 motifs.
In the bottom right corner of each motif panel, we
record two counts: (1) the number of real games
belonging to the motif’s cluster; and (2) the average number of
our ensemble of 100 × 1,310 biased random walk games
(see Sec. III) which are closest to the motif according
to Eq. (2). For each motif, we compute the ratio of real
to random adjacent game stories, $R$, and, as a guide, we
color the motifs as


• red if R > 1.1 (real game stories are more abun¬ 
dant); 

• gray if 0.9 < R < 1.1 (counts of real and random 
game stories are close); and 

• blue if R < 0.9 (random game stories are more 
abundant). 
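As a concrete illustration of the coloring rule, the following minimal sketch computes R from a pair of counts and assigns a color. The function names are hypothetical, the treatment of a motif with no adjacent random walks (R taken as infinite, hence red) and of values exactly on a threshold (assigned to gray) are assumptions, and the example pairs are counts quoted later in the text for the 71 motif taxonomy plus one made-up balanced case.

```python
def real_to_random_ratio(n_real, n_random_avg):
    """Ratio R of real games to the average count of adjacent random games."""
    if n_random_avg == 0:
        # Assumption: no adjacent random walks means real games dominate.
        return float("inf")
    return n_real / n_random_avg


def motif_color(ratio, tolerance=0.1):
    """Map R to the guide colors using the 10% criterion."""
    if ratio > 1 + tolerance:
        return "red"    # real game stories are more abundant
    if ratio < 1 - tolerance:
        return "blue"   # random game stories are more abundant
    return "gray"       # counts of real and random game stories are close


examples = {
    "motif #14 (71 motif level)": (25, 15.13),
    "motif #20 (71 motif level)": (10, 22.67),
    "hypothetical balanced motif": (20, 20.0),
}
for name, (n_real, n_random) in examples.items():
    r = real_to_random_ratio(n_real, n_random)
    print(f"{name}: R = {r:.2f}, color = {motif_color(r)}")
```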


We immediately observe that the number of games
falling within each cluster is highly variable, with
only 3 in the most extreme blowout motif (#1,
Fig. 12A/Fig. 13A) and 169 in a gradual-pulling-away
motif (#8; see Figs. 12 and 13).

The average motif game stories in Fig. [12] provide us 
with the essence of each cluster, and, though they do not 
represent any one real game, they are helpful for the eye 
in distinguishing clusters. Naturally, by applying further 
coarse-graining as we do below, we will uncover a richer 
array of more specialized motifs. 

Looking at Figs. 12 and 13, we now clearly see a continuum
of game shapes ranging from extreme blowouts
(motif #1) to extreme comebacks, both successful (motif
#17) and failed (motif #18). We observe that while some
motifs have qualitatively similar story lines, a game motif
that has a monotonically increasing score differential
that ends with a margin of 200 (#1) is certainly different
from one with a final margin of 50 (#6).

In considering this induced taxonomy of 18 game mo¬ 
tifs, we may interpret the following groupings: 


• #1-#6, #8: One-sided, runaway matches; 

• #9: Losing early on, coming back, and then pulling 
away; 

• #7 and #10: Initially even contests with one side 
eventually breaking away; 

• #11 and #12: One team taking an early lead and 
then holding on for the rest of the game; 

• #13, #14, and #16: Variations on tight contests; 

• #15 and #17: Successful comebacks; 

• #18: Failed comebacks. 


We note that the game stories attached to each motif
might not fit these descriptions; we are only categorizing
motifs. As we move to finer grain taxonomies, the
neighborhood around motifs diminishes and the shape of
each motif will become increasingly congruent with its
constituent games.

Figure 13. Real game motifs for an 11 point cut off as per
Fig. 12 but reordered according to decreasing ratio of adjacent
real to biased random games, R, and with closest biased random
walk rather than real game stories plotted underneath
in light gray. See the caption for Fig. 12 for more details.


The extreme blowout motif for real games has
relatively fewer adjacent random walk game stories
(Fig. 13A), as do the two successful comeback motifs
and games with a lead developed by half time that then
remains stable (see the red motifs in Fig. 13).
A total of 5 motifs show a relatively even balance between
real and random (i.e., within 10%) including two
of the six motifs with the tightest finishes. Biased random
walks most overproduce games in which an early loss is
turned around strongly or an early lead is maintained
(the final panels of Fig. 13). In terms of
game numbers behind motifs, we find a reasonable balance
with 603 (46.0%) having R > 1.1 (7 motifs), 430
(32.8%) with 0.9 < R < 1.1 (5 motifs), and 277 (21.1%)
with R < 0.9 (6 motifs).

Depending on the point of view of the fan and again
at this level of 18 motifs, we could argue that certain real
AFL games that feature more often than our null model
would suggest are more or less ‘interesting’. For example,
we see some dominating wins are relatively more abundant
in the real game (#1, #2, and #4). While such
games are presumably gratifying for fans of the team
handing out the ‘pasting’, they are likely deflating for
the supporters of the losing team. And a neutral observer
may or may not enjoy the spectacle of a superior
team displaying their prowess. Real games do exhibit
relatively more of the two major comeback motifs (#15
and #17), which are certainly exciting in nature, though
fewer of the failed comebacks (#18).


D. Taxonomy of 71 motifs for real AFL games 


Increasing our level of resolution to an
average intra-cluster game distance of $\langle \rho \rangle = 9$, we now
resolve the AFL game story ecology into 71 clusters. We
present all 71 motifs in Figs. 14 and 15, ordering by final
margin and real-to-random game story ratio R respectively
(we will refer to motif number and Fig. 15 so readers
may easily connect to the orderings in both figures).
With a greater number of categories, we naturally see a
more even distribution of game stories across motifs with
a minimum of 1 (Motif #1, Fig. 15AC) and a maximum
of 48 (Motif #43, Fig. 15AH).

As for the coarser 18 motif taxonomy, we again observe
a mismatch between real and biased random walk games.
For example, motif #14 (Fig. 15AF) is an average of 25
real game stories compared with on average 15.13 adjacent
biased random walks while motif #20 (Fig. 15CS)
has R = 10/22.67. Using our 10% criterion, we see 25 motifs
have R > 1.1 (representing 553 games or 42.2%), 23
have 0.9 < R < 1.1 (420 games, 32.1%), and the remaining
23 have R < 0.9 (337 games, 25.7%). Generally, we
again see blowouts are more likely in real games. However,
we find some kinds of comeback motifs are
also more prevalent (R > 1.1) though not strongly in
absolute numbers; these include the failed comebacks in
motifs #67 (Fig. 15AD) and #71 (Fig. 15AE), and the
major comeback in motif #64 (Fig. 15AB).

In Fig. 16 we give summary plots for the 18 and 71
motif taxonomies with motif final margin as a function
of the real-to-random ratio R. The larger final
margins of the blowout games feature on the right of
these plots (R > 1.1), and, in moving to the left, we see
a gradual tightening of games as shapes become more
favorably produced by the random null model (R < 0.9).
The continuum of game stories is also reflected in the
basic similarity of the two plots in Fig. 16, made as they
are for two different levels of coarse-graining.

Returning to Figs. 14 and 15, we highlight ten examples
that both reinforce and refine motifs seen
at the 18 motif level. We frame them as follows (in order
of decreasing R and referencing Fig. 15):


• Fig. 15AB, #64 (R = 11/5.71): The late, great
comeback;

• Fig. 15AE, #71 (R = 7/4.00): The massive comeback
that just falls short;

• Fig. 15, #52 (R = 29/19.80): Comeback over the
first half connecting into a blowout in the second
(the winning team may be said to have ‘Turned the
corner’);

• Fig. 15AM, #13 (R = 32/23.33): An exemplar
blowout (and variously a shellacking, thrashing,
or hiding);

• Fig. 15AX, #55 (R = 26/23.16): Rope-a-dope
(taking steady losses and then surging late);

• Fig. 15BZ, #68 (R = 7/8.05): Hold-slide-hold-surge;

• Fig. 15CD, #56 (R = 12/14.69): See-saw battle;

• Fig. 15CK, #62 (R = 19/26.26): The tightly
fought nail-biter (or heart stopper);

• Fig. 15CP, #50 (R = 15/28.25): Burn-and-hold (or
the game-manager, or the always dangerous playing
not-to-lose);

• Fig. 15CQ, #36 (R = 9/17.19): Surge-slide-surge.


These motifs may also be grouped according to the number
of ‘acts’ in the game. Motif #53 (Fig. 15AO), for example,
is a three-act story while motifs #56 (Fig. 15CD)
and #68 (Fig. 15BZ) exhibit four acts. We invite the
reader to explore the rest of the motifs in Fig. 15.


VI. PREDICTING GAMES USING SHAPES OF 
STORIES 

Can we improve our ability to predict the outcome of 
a game in progress by knowing how games with similar 
stories played out in the past? Does the full history of a 
game help us gain any predictive power over much sim¬ 
pler game state descriptions such as the current time and 
score differential? In this last section, we explore predic¬ 
tion as informed by game stories, a natural application. 

Suppose we are in the midst of viewing a new game.
We know the game story $g_{\mathrm{obs}}$ from the start of the game
until the current game time $t < T_{\mathrm{obs}}$, where $T_{\mathrm{obs}}$ is the
eventual length of game (and is another variable which we
could potentially predict). In part to help with presentation
and analysis, we will use minute resolution (meaning
$t = 60n$ for $n = 0, 1, 2, \ldots$). Our goal is to use our
database of completed games, for which of course we
know the eventual outcomes, to predict the final margin
of our new game, $g_{\mathrm{obs}}(T_{\mathrm{obs}})$.

Figure 14. [...] game motifs as determined by hierarchical
clustering analysis with a threshold of nine points, [...]
Fig. 10 and described in Sec. V. Motifs are ordered by their
final margin, highest to lowest, and [...] the background of
each motif. Cutoffs for motif colors red, gray, and blue
correspond to real-[...] and the top number indicates motif
rank according to final margin. The same process applied to
[...] for our 100 simulations typically yields only 45 to 50
motifs (see Fig. 11). We discuss the ten [...] text and note
that we have allowed the vertical axis limits to vary.

Figure 15. Motifs (red curves) from Fig. 14 rearranged in
order of descending ratio of the number of real games to
the number [...]

Figure 16. Final margin of motifs as a function of real-to-random
ratio R for real AFL games at the 18 and 71 motif
levels, panels A and B respectively, with linear fits. On the
right of each plot, extreme blowout motifs ending in high margins
have no or relatively few adjacent random walks (red,
R > 1.1). On the left, game stories are more well represented
by random walks (blue, R < 0.9). There is considerable variation
however, particularly in the 71 motif case, and we certainly
see some close finishes with R > 1 (e.g., the massive
comeback, motif #71, Fig. 15AE).

We create a prediction model with two parameters: (1)
N: the desired number of analog games closest to our
present game; and (2) M: the number of minutes going
back from the current time for which we will measure
the distance between games. For a predictor, we simply
average the final margins of the N closest analog games
to $g_{\mathrm{obs}}$ over the interval $[t - 60M, t]$. That is, at time t,
we predict the final margin of $g_{\mathrm{obs}}$, F, using M minutes
of memory and N analog games as:

$$
F(g_{\mathrm{obs}}, t/60, M, N) = \frac{1}{N} \sum_{i \in \Omega(g_{\mathrm{obs}}, t/60, M, N)} g_i(T_i), \tag{6}
$$

where $\Omega(g_{\mathrm{obs}}, t/60, M, N)$ is the set of indices for the N
games closest to the current game over the time span
$[t - 60M, t]$, and $T_i$ is the final second of game i.
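The predictor of Eq. (6) can be sketched as follows. This is a minimal illustration rather than the implementation used for the results here: the function names are hypothetical, game stories are assumed to be lists of score differentials sampled once per minute, and the distance between two stories over the memory window is assumed to be the mean absolute difference in score differential, standing in for the game distance defined earlier in the paper.

```python
def window_distance(g_a, g_b, t_min, memory):
    """Mean absolute gap between two stories over minutes [t_min - memory, t_min].

    Assumption: this stands in for the paper's game distance restricted to
    the memory window; stories are indexed by minute.
    """
    start = max(0, t_min - memory)
    minutes = range(start, t_min + 1)
    return sum(abs(g_a[n] - g_b[n]) for n in minutes) / len(minutes)


def predict_final_margin(g_obs, past_games, t_min, memory, n_analogs):
    """Average the final margins of the n_analogs past games closest to g_obs."""
    usable = [g for g in past_games if len(g) > t_min]
    if not usable:
        raise ValueError("no completed game is long enough to compare")
    ranked = sorted(usable, key=lambda g: window_distance(g_obs, g, t_min, memory))
    analogs = ranked[:n_analogs]
    # Divide by the number of analogs actually found (N when enough games exist).
    return sum(g[-1] for g in analogs) / len(analogs)


# Hypothetical toy database of completed games (minute-by-minute differentials).
past = [
    [0, 5, 10, 15, 20],
    [0, 4, 9, 14, 30],
    [0, -5, -10, -15, -20],
    [0, 0, 0, 0, 0],
]
g_obs = [0, 5, 10]  # story known up to minute 2
print(predict_final_margin(g_obs, past, t_min=2, memory=1, n_analogs=2))  # prints 25.0
```

In the toy example the two nearest stories finish at margins of 20 and 30, so the prediction is their mean. When a real game in the database is itself the game being predicted, it would need to be excluded from `past_games` before calling the function.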

For an example demonstration, in Fig. 17 we attempt
to predict the outcome of an example game story given
knowledge of its first 60 minutes (red curve) and by finding
the average final margin of the N = 50 closest games
over the interval 45 to 60 minutes (M = 15, shaded gray
region). Most broadly, we see that our predictor here
would correctly call the winning team. At a more detailed
level, the average final margin of the analog games
slightly underestimates the final margin of the game of interest,
and the range of outcomes for the 50 analog games
is broad with the final margin spanning from around -40
to 90 points.

Figure 17. Illustration of our prediction method given in
Eq. (6). We start with a game story $g_{\mathrm{obs}}$ (red curve) which
we know up until, for this example, 60 minutes (t = 3600).
We find the N = 50 closest game stories based on matching
over the time period 45 to 60 minutes (memory M = 15), and
show these as gray curves. We indicate the average final score
$F(g_{\mathrm{obs}}, t/60, M, N)$ for these analog games with the horizontal
blue curve. (Horizontal axis: game time in minutes.)

Having defined our prediction method, we now system¬ 
atically test its performance after 30, 60, and 90 minutes 
have elapsed in a game currently under way. In aiming to 
find the best combination of memory and analog number, 
M and N, we use Eq. (6) to predict the eventual winner of 
all 1,310 AFL games in our data set at these time points. 
First, as should be expected, the further a game has pro¬ 
gressed, the better our prediction. More interestingly, in 
Fig. [18] we see that for all three time points, increasing 
N elevates the prediction accuracy, while increasing M 
has little and sometimes the opposite effect, especially 
for small N. The current score differential serves as a 
stronger indicator of the final outcome than the whole 
game story shape unfolded so far. The recent change 
in scores—momentum—is also informative, but to a far 
lesser extent than the simple difference in scores at time 
t. 

Based on Fig. 18, we proceed with N = 50 analogs 
and two examples of low memory: M = 1 and M = 10. 
We compare with the naive model that, at any time t, 
predicts the winner as being the current leader. 
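The naive baseline and the accuracy measure used in this comparison can be sketched as below. The names are hypothetical, stories are again assumed to be minute-by-minute lists of score differentials, and two conventions are assumptions of this sketch: drawn games are left out of the tally, and a tied score at the prediction time counts as a miss because no leader can be called.

```python
def sign(x):
    """Return +1, 0, or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def naive_call(story, t_min):
    """Naive model: the team leading at minute t_min is predicted to win."""
    return sign(story[t_min])


def winner_accuracy(call, games, t_min):
    """Fraction of decided games whose eventual winner `call` picks at t_min."""
    decided = [g for g in games if len(g) > t_min and g[-1] != 0]
    if not decided:
        raise ValueError("no decided game reaches the requested minute")
    hits = sum(1 for g in decided if call(g, t_min) == sign(g[-1]))
    return hits / len(decided)


# Hypothetical toy games (not AFL data): two wire-to-wire wins, two comebacks.
toy_games = [
    [0, 6, 12, 18],
    [0, -6, 1, 7],
    [0, 6, -1, -7],
    [0, -6, -12, -18],
]
print(winner_accuracy(naive_call, toy_games, t_min=1))  # prints 0.5
print(winner_accuracy(naive_call, toy_games, t_min=2))  # prints 1.0
```

Because `winner_accuracy` takes the calling rule as an argument, the sign of an analog-based margin prediction could be passed in place of `naive_call` to compare the two approaches on the same games.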

We see in Fig. 19 that there is essentially no differ¬ 
ence in prediction performance between the two methods. 
Thus, memory does not appear to play a necessary role 
in prediction for AFL games. Of interest going forward 
will be the extent to which other sports show the same 
behavior. For predicting the final score, we also observe 
that simple linear extrapolation performs well on the en¬ 
tire set of the AFL games (not shown). 

While we have thus far found no compelling evidence
for using game stories in prediction, nuanced analyses
incorporating game stories for AFL and other professional
sports may nevertheless yield substantive improvements
over these simple predictive models [21].

Figure 18. Fraction of games correctly predicted using the average final score of N analog games, with adjacency evaluated over
the last M minutes at the three game times of A. 30, B. 60, and C. 90 minutes. Increasing the number of analogs provides the
strongest benefit for prediction while increasing memory either degrades or does not improve performance. Because prediction
improves as a game is played out, the color bars cover the same span of accuracy (0.06) but with range translated appropriately.

Figure 19. Prediction accuracy using the described game
shape comparison model using N = 50 analogs and
memories of M = 1 (blue curve) and M = 10 (green curve),
compared with the naive model of assuming that the current
leader will ultimately win (red curve).

VII. CONCLUDING REMARKS 

Overall, we find that the sport of Australian Rules 
Football presents a continuum of game types ranging 
from dominant blowouts to last minute, major come¬ 
backs. Consequently, and rather than uncovering an op¬ 
timal number of game motifs, we instead apply coarse- 
graining to find a varying number of motifs depending on 
the degree of resolution desired. 

We further find that (1) A biased random walk affords
a reasonable null model for AFL game stories; (2)
The scoring bias distribution may be numerically determined
so that the null model produces a distribution of
final margins which suitably matches that of real games;
(3) Blowout and major comeback motifs are much more
strongly represented in the real game whereas tighter
games are generally (but not entirely) more favorably
produced by a random model; and (4) AFL game motifs
are overall more diverse than those of the random
version.

Our analysis of an entire sport through its game story 
ecology could naturally be applied to other major sports 
such as American football, Association football (soccer), 
basketball, and baseball. A cross-sport comparison for 
any of the above analyses would likely be interesting and 
informative. And at a macro scale, we could also ex¬ 
plore the shapes of win-loss progressions of franchises 
over years [22] . 

It is important to reinforce that a priori, we were un¬ 
clear as to whether there would be distinct clusters of 
games or a single spectrum, and one might imagine rough 
theoretical justifications for both. Our finding of a spec¬ 
trum conditions our expectations for other sports, and 
also provides a stringent, nuanced test for more advanced 
explanatory mechanisms beyond biased random walks, 
although we are wary of the potential difficulty involved 
in establishing a more sophisticated and still defensible 
mechanism. 

Finally, a potentially valuable future project would be 
an investigation of the aesthetic quality of both individ¬ 
ual games and motifs as rated by fans and neutral indi¬ 
viduals [53]. Possible sources of data would be (1) social 
media posts tagged as being relevant to a specific game, 
and (2) information on game-related betting. Would true 
fans rather see a boring blowout with their team on top 
than witness a close game [31 HI? Is the final margin the 
main criterion for an interesting game? To what extent 
do large momentum swings engage an audience? Such 
a study could assist in the implementation of new rules 
and policies within professional sports. 